Abstract

Some aspects of the result of applying unit resolution on a CNF formula can be formalized as functions 
with domain a set of partial truth assignments. We are interested in two ways for computing such 
functions, depending on whether the result is the production of the empty clause or the assignment of 
a variable with a given truth value. We show that these two models can compute the same functions 
with formulae of polynomially related sizes, and we explain how this result is related to the CNF 
encoding of Boolean constraints. 
1. Introduction

1.1. Theoretical framework

In this paper, we deal with Boolean variables, constraints, and assignments. Any assignment of
a Boolean variable $v$ is denoted either $[v,0]$ or $[v,1]$. Given any set $V$ of Boolean variables, an
assignment on $V$ is a set $I$ of assignments of variables of $V$. $I$ is said to be complete if it assigns
exactly one value to each variable of $V$, partial if it assigns at most one value to each variable of $V$,
and contradictory if there is a variable $v \in V$ such that $[v,0] \in I$ and $[v,1] \in I$. Unless otherwise
stated, an assignment on $V$ is assumed to be partial and not contradictory. The set of all
non-contradictory complete assignments on $V$ will be denoted $\mathcal{A}_V$, while the set of all
non-contradictory partial assignments on $V$ will be denoted $\mathcal{I}_V$.

Given any set $V$ of Boolean variables, the term Boolean constraint on $V$ will be used in its widest
sense, i.e., any computational representation $q$ of a satisfiability function $h_q$ with domain $\mathcal{A}_V$ and
codomain {sat, unsat}. Given any complete assignment $A \in \mathcal{A}_V$, any constraint $q$ on $V$ is said to be
satisfied by $A$ if $h_q(A) = \mathrm{sat}$, else it is said to be falsified by $A$. Given any partial assignment
$I \in \mathcal{I}_V$, any constraint $q$ on $V$ is said to be satisfied (falsified, respectively) by $I$ if and only if
$q$ is satisfied (falsified, respectively) by every $A \in \mathcal{A}_V$ such that $I \subseteq A$.

In propositional logic, a literal is either a propositional variable $v$ or its negation $\neg v$. By convention,
the truth values will be denoted as the Boolean values 0 and 1. A clause is any disjunction of literals
$\omega_1 \vee \cdots \vee \omega_k$, and a CNF formula is any conjunction of clauses $c_1 \wedge \cdots \wedge c_m$. The size of a
clause is its number of literals. The size of a formula is the sum of the sizes of its clauses.

Literals, clauses, and CNF formulae can be considered as Boolean constraints: $v$ is satisfied by $[v,1]$,
$\neg v$ by $[v,0]$, a clause is satisfied if and only if at least one of its literals is satisfied, and a CNF
formula is satisfied if and only if all its clauses are satisfied.

The following conventions will be used in the rest of the paper: given any variable $v$, the assignment
$[v,1]$ can be denoted $[v]$, and the assignment $[v,0]$ can be denoted $[\neg v]$; any clause can be considered
as a set of literals, and any formula can be considered as a set of clauses; for any set $V$ of Boolean
variables, $\mathrm{lit}(V)$ denotes the set of literals based on variables of $V$, namely $\bigcup_{v \in V} \{v, \neg v\}$.

Any CNF formula $\Sigma$ is said to be satisfiable if and only if there exists a truth assignment which satisfies
$\Sigma$. SAT is the problem of determining whether an arbitrary CNF formula $\Sigma$ is satisfiable or not. Given
any formula $\Sigma$ with variables $V$, and any assignment $I$ on $V$, $\Sigma|_I$ denotes the formula
$\Sigma \wedge \bigwedge_{[\omega] \in I} (\omega)$, i.e., the formula $\Sigma$ where the clause $(v)$ is added for each assignment
$[v,1] \in I$, and the clause $(\neg v)$ is added for each assignment $[v,0] \in I$.
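
As a small worked instance of this notation (the formula and the assignment below are hypothetical and chosen only for illustration), take variables $a$, $b$, $c$:

```latex
\begin{align*}
\Sigma &= (a \vee b) \wedge (\neg b \vee c), \qquad I = \{[a,0],\,[c,1]\} = \{[\neg a],\,[c]\},\\
\Sigma|_I &= (a \vee b) \wedge (\neg b \vee c) \wedge (\neg a) \wedge (c).
\end{align*}
```

The two added unit clauses are the only difference between $\Sigma$ and $\Sigma|_I$; the variable $b$, which $I$ leaves unassigned, contributes no clause.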

Introduced in [14], unit resolution uses unit clauses to produce new variable assignments and, when
applicable, to detect inconsistencies. For the purpose of this paper, its principle can be described as
follows. Given any assignment $I$, a clause $c$ is said to be a unit clause with respect to $I$ if and only if
$I$ falsifies all the literals of $c$ except for one literal $\omega$, which will be called the active literal of $c$. Given
any CNF formula $\Sigma$, the unit resolution process starts from an empty set $U$ of variable assignments,
which is iteratively augmented by the active literals of unit clauses with respect to $U$, until either $U$
becomes contradictory or no new literal can be inferred. The formula can then be simplified
by removing any non-unit clause satisfied by $U$, as well as any literal falsified by $U$. The resulting
formula $\Sigma'$ is logically equivalent to $\Sigma$. If $U$ is contradictory, then the empty clause belongs to $\Sigma'$,
implying that $\Sigma$ is not satisfiable. SAT solvers use unit resolution to speed up the search for
solutions or inconsistencies by reducing the number of decisions (binary nodes) in the search tree.
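
The process just described can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the procedure of any particular solver: literals are encoded as nonzero integers (`v` for a variable, `-v` for its negation), the name `unit_resolution` and the example formula are hypothetical, and a clause whose literals are all falsified is reported as a contradiction, which is how the empty clause shows up in this encoding.

```python
def unit_resolution(clauses, assignment=()):
    """Stage-wise unit resolution on `clauses` restricted by `assignment`.

    Each literal of `assignment` is added as a unit clause (the formula
    written Sigma|_I in the text). Returns (U, contradiction): U is the set
    of inferred literals, contradiction tells whether the empty clause
    would be produced.
    """
    formula = [tuple(c) for c in clauses] + [(l,) for l in assignment]
    U = set()
    while True:
        inferred = set()
        for c in formula:
            # Literals of c that the current assignment U does not falsify.
            open_lits = [l for l in c if -l not in U]
            if not open_lits:
                return U, True            # every literal falsified
            if len(open_lits) == 1:
                inferred.add(open_lits[0])  # active literal of a unit clause
        if any(-l in inferred for l in inferred):
            return U | inferred, True     # two opposite literals inferred
        if inferred <= U:
            return U, False               # no new literal: fixed point
        U |= inferred


# Hypothetical formula over a=1, b=2, c=3.
sigma = [(1, 2), (-2, 3), (-1, -3), (2, 3)]
print(unit_resolution(sigma, [-1]))     # no contradiction from [a,0]
print(unit_resolution(sigma, [1])[1])   # contradiction from [a,1]
```

Each pass of the `while` loop corresponds to one stage of the process: all unit clauses with respect to the current `U` are collected before `U` is extended.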

1.2. Motivation 

Given any set $V$ of propositional variables, we are interested in functions with domain $D \subseteq \mathcal{I}_V$ and
codomain {yes, no} (see footnote 1) which can account for some aspects of the result of applying unit
resolution to a CNF formula: the empty clause is produced, or a given variable is assigned to 1, or it is
assigned to 0. In the scope of this report, these functions will be called matching functions.

Given any formula $\Sigma$, and any set $V$ of propositional variables occurring in $\Sigma$, the inferences made by
unit resolution can be modeled by the following matching functions:

• The function $f_\Sigma : \mathcal{I}_V \to \{\mathrm{yes}, \mathrm{no}\}$ such that for any partial assignment $I \in \mathcal{I}_V$, $f_\Sigma(I) = \mathrm{yes}$ if
and only if applying unit resolution on $\Sigma|_I$ produces the empty clause. We will say that unit
resolution computes this function by contradiction.

• For any literal $\omega \in \mathrm{lit}(V)$, the function $g_{\Sigma,\omega} : D_\omega \to \{\mathrm{yes}, \mathrm{no}\}$, where
$D_\omega = \{I \in \mathcal{I}_V : f_\Sigma(I) = \mathrm{no}\}$, such that for any partial truth assignment $I \in D_\omega$,
$g_{\Sigma,\omega}(I) = \mathrm{yes}$ if and only if applying unit resolution to $\Sigma|_I$ infers $[\omega]$. We will say that
unit resolution computes these functions by propagation.
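
The two matching functions above can be sketched on top of a small unit resolution routine. The code below is a hedged illustration: the names `propagate`, `f_sigma` and `g_sigma`, the integer encoding of literals (`v` and `-v`), and the example formula are assumptions made for this sketch and do not come from the paper.

```python
def propagate(clauses, assignment):
    """Stage-wise unit resolution; returns (inferred literals, contradiction)."""
    formula = [tuple(c) for c in clauses] + [(l,) for l in assignment]
    U = set()
    while True:
        inferred = set()
        for c in formula:
            open_lits = [l for l in c if -l not in U]
            if not open_lits:
                return U, True
            if len(open_lits) == 1:
                inferred.add(open_lits[0])
        if any(-l in inferred for l in inferred):
            return U | inferred, True
        if inferred <= U:
            return U, False
        U |= inferred


def f_sigma(sigma, I):
    """Computation by contradiction: 'yes' iff the empty clause is produced."""
    return "yes" if propagate(sigma, I)[1] else "no"


def g_sigma(sigma, omega, I):
    """Computation by propagation: 'yes' iff the literal omega is inferred."""
    U, contradiction = propagate(sigma, I)
    if contradiction:
        raise ValueError("I is outside the domain, since f_sigma(I) = yes")
    return "yes" if omega in U else "no"


# Hypothetical formula (not a or b) and (not b or c), with a=1, b=2, c=3.
sigma = [(-1, 2), (-2, 3)]
print(f_sigma(sigma, [1, -3]))   # yes
print(g_sigma(sigma, 3, [1]))    # yes
print(g_sigma(sigma, 3, []))     # no
```

Raising an exception outside the domain mirrors the restriction of $g_{\Sigma,\omega}$ to assignments from which no empty clause is produced; returning a third value would be an equally reasonable design choice.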

Knowing which matching functions can be computed by contradiction, as well as which ones
can be computed by propagation, with a polynomial number of clauses is crucial for the study of
the CNF encodings of Boolean constraints. Given any set $V$ of Boolean variables and any constraint
$q$ on $V$, a CNF encoding of $q$ is any CNF formula $\Sigma_q$ (which can include variables not belonging to $V$)
such that for any complete assignment $A \in \mathcal{A}_V$, $\Sigma_q|_A$ is satisfiable if and only if $A$ satisfies $q$. This
property allows any constraint satisfiability problem to be solved using a SAT solver.

Two interesting additional properties of CNF encodings have been reported as potentially improving
the efficiency of solving the resulting SAT instances:

1. Given any encoded constraint $q$, unit resolution detects any partial assignment which falsifies $q$:
from any such assignment, the empty clause is produced. For example, this property is studied
in [13] in the context of Boolean cardinality constraints. Such an encoding will be called upi
(for "unit propagation detects inconsistency") in the following.

2. Given any encoded constraint $q$ on variables $V$, unit resolution enforces the generalized arc
consistency of $q$, i.e., for any partial assignment $I \in \mathcal{I}_V$ which does not falsify $q$, and any literal
$\omega \in \mathrm{lit}(V)$, if $I \cup \{[\omega]\}$ falsifies $q$ then $[\neg\omega]$ is inferred. This criterion was introduced in [5]. Such
an encoding will be called upac (for "unit propagation restores generalized arc consistency") in
the following.

In most cases, only the encodings producing a formula of size polynomially related to the number of 
variables of the input constraint can be used in practice. They will be called polynomial encodings in 
the following. 

Let us consider a family $Q$ of constraints on Boolean variables. For any constraint $q$ of $Q$ with variables
$V = \{v_1, \ldots, v_n\}$, let us define the inconsistency function of $q$ as $f_q : \mathcal{I}_V \to \{\mathrm{yes}, \mathrm{no}\}$ such that for
any partial assignment $I \in \mathcal{I}_V$, $f_q(I) = \mathrm{yes}$ if and only if $I$ falsifies $q$.

Clearly, the existence of a polynomial upi encoding for the constraints of Q depends on the existence of 
polynomially sized CNF formulae allowing unit resolution to compute by contradiction the inconsistency 
functions of the constraints of Q. 

Now, for any constraint $q$ of $Q$ with variables $V = \{v_1, \ldots, v_n\}$, and any literal $\omega \in \mathrm{lit}(V)$, let us
define the arc consistency functions of $q$ as $g_{q,\omega} : D_q \to \{\mathrm{yes}, \mathrm{no}\}$, $D_q = \{I \in \mathcal{I}_V : f_q(I) = \mathrm{no}\}$, such
that for any partial assignment $I \in D_q$, $g_{q,\omega}(I) = \mathrm{yes}$ if and only if $I \cup \{[\neg\omega]\}$ falsifies $q$.

Clearly again, the existence of a polynomial upac encoding for the constraints of $Q$ can be expressed as
the existence of polynomially sized CNF formulae allowing unit resolution to compute by propagation
the arc consistency functions related to the constraints of $Q$.
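
To make the inconsistency and arc consistency functions concrete, the following brute-force sketch evaluates them directly from their definitions, by enumerating every complete extension of a partial assignment. Everything named here is an assumption made for illustration: the helper names, the dictionary representation of assignments, and the "at most one" constraint used as an example. When the variable of $\omega$ is already assigned in $I$, the sketch treats $I \cup \{[\neg\omega]\}$ as falsifying $q$ exactly when it is contradictory, which is one possible reading of the definition.

```python
from itertools import product


def falsifies(h, variables, partial):
    """True iff every complete extension of `partial` is rejected by h."""
    free = [v for v in variables if v not in partial]
    return all(
        h({**partial, **dict(zip(free, values))}) == "unsat"
        for values in product((0, 1), repeat=len(free))
    )


def f_q(h, variables, partial):
    """Inconsistency function: 'yes' iff `partial` falsifies the constraint."""
    return "yes" if falsifies(h, variables, partial) else "no"


def g_q(h, variables, var, value, partial):
    """Arc consistency function for the literal [var, value]."""
    if f_q(h, variables, partial) == "yes":
        raise ValueError("partial assignment outside the domain D_q")
    if var in partial:
        # Adding the opposite value either changes nothing or contradicts I.
        return "yes" if partial[var] == value else "no"
    extended = {**partial, var: 1 - value}
    return "yes" if falsifies(h, variables, extended) else "no"


# Hypothetical constraint: at most one of a, b, c is assigned 1.
def at_most_one(a):
    return "sat" if sum(a.values()) <= 1 else "unsat"


V = ["a", "b", "c"]
print(f_q(at_most_one, V, {"a": 1, "b": 1}))   # yes
print(g_q(at_most_one, V, "b", 0, {"a": 1}))   # yes: [b,1] would falsify q
print(g_q(at_most_one, V, "b", 1, {"a": 1}))   # no
```

The enumeration is exponential in the number of unassigned variables, so it only serves as a reference against which an encoding could be checked on small instances.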



1 Without loss of generality, these values have been chosen so as to avoid any ambiguity with the logical values true
and false or the Boolean values 0 and 1.
1.3. Contribution

We show that any matching function can be computed by unit contradiction if and only if it can
be computed by unit propagation, and that any family of matching functions can be computed in
polynomial size (and thus in polynomial time) by unit contradiction if and only if it can be computed in
polynomial size by unit propagation.

As a corollary, for any family $Q$ of constraints with Boolean variables, if there exists a polynomial upi
encoding for $Q$ then there exists a polynomial upac encoding for $Q$.

2. Technical results 

In this section, we will formalize and prove the previously presented results as the two following 
theorems. 

Theorem 1. Let $f$ be any matching function. If $f$ can be computed by propagation using a formula
of size $p$, then $f$ can be computed by contradiction with a formula of size $p + 1$.

Proof. Any CNF formula $\Sigma_p$ computing a matching function $f$ by propagation can be reduced in the
following way to a formula $\Sigma_c$ computing $f$ by contradiction.

Let $f$ be a matching function with domain $D \subseteq \mathcal{I}_V$. Let $\Sigma_p$ be a CNF formula allowing unit resolution
to compute $f$ by propagation. This means that there is a literal $\omega$ such that for any $I \in D$, applying
unit propagation to $\Sigma_p|_I$ cannot produce the empty clause, but infers $[\omega]$ if and only if $f(I) = \mathrm{yes}$.
Then the formula $\Sigma_c = \Sigma_p \wedge (\neg\omega)$ allows unit resolution to compute $f$ by contradiction. □
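
The reduction can be tried out on a toy instance. The sketch below is only an illustration under assumptions: the formula `sigma_p` (the implications $a \Rightarrow b$ and $b \Rightarrow c$ with output literal $c$), the integer encoding of literals and all function names are hypothetical. It checks the claimed equivalence on this one formula for every assignment of its domain, and does not attempt to verify the statement in general.

```python
from itertools import product


def propagate(clauses, assignment):
    """Stage-wise unit resolution; returns (inferred literals, contradiction)."""
    formula = [tuple(c) for c in clauses] + [(l,) for l in assignment]
    U = set()
    while True:
        inferred = set()
        for c in formula:
            open_lits = [l for l in c if -l not in U]
            if not open_lits:
                return U, True
            if len(open_lits) == 1:
                inferred.add(open_lits[0])
        if any(-l in inferred for l in inferred):
            return U | inferred, True
        if inferred <= U:
            return U, False
        U |= inferred


def partial_assignments(variables):
    """Yield every non-contradictory partial assignment as a list of literals."""
    for choice in product((None, 0, 1), repeat=len(variables)):
        yield [v if b else -v for v, b in zip(variables, choice) if b is not None]


sigma_p = [(-1, 2), (-2, 3)]        # hypothetical formula, a=1, b=2, c=3
omega = 3                           # output literal
sigma_c = sigma_p + [(-omega,)]     # one extra clause of size 1

checks = []
for I in partial_assignments([1, 2]):
    U, contradiction = propagate(sigma_p, I)
    if contradiction:
        continue                    # I is outside the domain D
    checks.append((omega in U) == propagate(sigma_c, I)[1])
print(all(checks), len(checks))
```

Only the assignment $\{[a,1],[b,0]\}$ is excluded from the domain here, because it already makes `sigma_p` produce the empty clause.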

Theorem 2. Let $f$ be any matching function. If $f$ can be computed by contradiction using a formula
of size $p$ with $n$ variables, then $f$ can be computed by propagation with a formula of size $O(pn^2)$.

Proof. Any CNF formula $\Sigma_c$ computing a matching function $f$ by contradiction can be reduced in
the following way to a formula $\Sigma_p$ computing $f$ by propagation.

Let $f$ be a matching function with domain $\mathcal{I}_V$. Let $\Sigma_c$ be a CNF formula allowing unit resolution to
compute $f$ by contradiction. We will construct a formula $\Sigma_p$ such that for any $I \in \mathcal{I}_V$, applying unit
resolution to $\Sigma_p|_I$ does not produce the empty clause, but assigns 1 to a new variable $s$ if and only if
applying unit resolution to $\Sigma_c|_I$ produces the empty clause. In a manner of speaking, applying unit
resolution to $\Sigma_p|_I$ simulates the effects of applying unit resolution to $\Sigma_c|_I$ without ever producing the
empty clause.

To this end, the unit resolution process is decomposed into stages such that each stage $i$ produces the
assignments induced from the unit clauses with respect to the assignment of stage $i - 1$, where
the assignment of stage 0 is $I_0 = I$. Let $V$ be the set of variables of $\Sigma_c$ and $n = |V|$. Because
the cardinality of any non-contradictory assignment on $V$ is at most $n$, the unit resolution process stops
after at most $n + 1$ stages.

Given any CNF formula $\Sigma$ and any integer $i$, let $U(\Sigma, i)$ denote the current assignment after $i$ unit
resolution stages on $\Sigma$.

The formula $\Sigma_p$ contains $2n(n + 1) + n$ variables, namely the variables of $V$ and, for each literal
$\omega \in \mathrm{lit}(V)$, $(n + 1)$ new variables denoted $x_{\omega,1}, \ldots, x_{\omega,n+1}$. It consists of the following clauses:

1. for any $v \in V$, $(v \vee x_{\neg v,1})$ and $(\neg v \vee x_{v,1})$, which are called injection clauses;

2. for any $\omega \in \mathrm{lit}(V)$ and any $i \in 1..n$, $(\neg x_{\omega,i} \vee x_{\omega,i+1})$, which are called replication clauses;

3. for any clause $c$ of $\Sigma_c$ with at least two literals, any literal $\omega \in c$, and any $i \in 1..n$,
$(x_{\omega,i+1} \vee \bigvee_{\rho \in c \setminus \{\omega\}} \neg x_{\neg\rho,i})$, which are called deduction clauses;

4. for any singleton clause $(\omega)$ of $\Sigma_c$, $(x_{\omega,1})$, which are called unit clauses.

Let us consider the following induction hypothesis $H_m$: for any $\omega \in \mathrm{lit}(V)$, $[x_{\omega,m}] \in U(\Sigma_p|_I, m)$ if
and only if $[\omega] \in U(\Sigma_c|_I, m)$.

For any $\omega \in \mathrm{lit}(V)$, $[\omega] \in U(\Sigma_c|_I, 1)$ if and only if $[\omega] \in I$ or $(\omega) \in \Sigma_c$. In the first case,
$[x_{\omega,1}] \in U(\Sigma_p|_I, 1)$ thanks to the injection clause $(\neg\omega \vee x_{\omega,1})$. In the second case,
$[x_{\omega,1}] \in U(\Sigma_p|_I, 1)$ thanks to the unit clause $(x_{\omega,1})$. Because only these clauses can infer $[x_{\omega,1}]$
during the unit resolution process on $\Sigma_p|_I$, and because they can infer it only in these two cases, $H_1$ holds.

Now, suppose that $H_m$ holds for some $m \in 1..n$, and let us consider any literal $\omega \in \mathrm{lit}(V)$. Regarding
the inference of $[\omega]$ by unit resolution on $\Sigma_c|_I$ at stage $m + 1$, three cases can be considered:

1. $[\omega] \in U(\Sigma_c|_I, m)$ and then $[\omega] \in U(\Sigma_c|_I, m + 1)$. By induction hypothesis, $[x_{\omega,m}] \in U(\Sigma_p|_I, m)$.
Thanks to the replication clause $(\neg x_{\omega,m} \vee x_{\omega,m+1})$ of $\Sigma_p$, $[x_{\omega,m+1}] \in U(\Sigma_p|_I, m + 1)$. See Figure
A.1 for a graphical illustration.

2. $[\omega] \notin U(\Sigma_c|_I, m)$ and $[\omega] \in U(\Sigma_c|_I, m + 1)$. Then there is a clause $(\rho_1 \vee \cdots \vee \rho_k \vee \omega)$ in $\Sigma_c$ such
that all the assignments $[\neg\rho_1]$ to $[\neg\rho_k]$ are in $U(\Sigma_c|_I, m)$. By induction hypothesis, $[x_{\neg\rho_1,m}]$ to
$[x_{\neg\rho_k,m}]$ are in $U(\Sigma_p|_I, m)$. Thanks to the deduction clause $(\neg x_{\neg\rho_1,m} \vee \cdots \vee \neg x_{\neg\rho_k,m} \vee x_{\omega,m+1})$,
$[x_{\omega,m+1}] \in U(\Sigma_p|_I, m + 1)$. See Figure A.2 for a graphical illustration.

3. $[\omega] \notin U(\Sigma_c|_I, m)$ and $[\omega] \notin U(\Sigma_c|_I, m + 1)$. The only clauses of $\Sigma_p$ that can infer $[x_{\omega,m+1}]$ are
the replication clauses and the deduction clauses. By induction hypothesis, $[x_{\omega,m}] \notin U(\Sigma_p|_I, m)$,
so no replication clause can infer $[x_{\omega,m+1}]$. Secondly, because $[\omega] \notin U(\Sigma_c|_I, m + 1)$, for any
clause $(\rho_1 \vee \cdots \vee \rho_k \vee \omega)$ in $\Sigma_c$, not all the literals $\rho_1$ to $\rho_k$ are falsified by $U(\Sigma_c|_I, m)$. By
induction hypothesis, not all the assignments $[x_{\neg\rho_1,m}]$ to $[x_{\neg\rho_k,m}]$ are in $U(\Sigma_p|_I, m)$. Then, the
corresponding deduction clause $(\neg x_{\neg\rho_1,m} \vee \cdots \vee \neg x_{\neg\rho_k,m} \vee x_{\omega,m+1})$ of $\Sigma_p$ cannot infer $[x_{\omega,m+1}]$.

Hence $H_m$ holds for any $m \in 1..(n + 1)$. Furthermore, because each unit resolution stage on $\Sigma_p|_I$
infers only positive literals, the empty clause is never produced. It follows that unit resolution on
$\Sigma_c|_I$ produces the empty clause (or, equivalently, infers two opposite literals) if and only if there is a
variable $v \in V$ such that $[x_{v,n+1}]$ and $[x_{\neg v,n+1}]$ are inferred by unit resolution on $\Sigma_p|_I$. Now, let us
complete the formula $\Sigma_p$ with the clauses $(\neg x_{v,n+1} \vee \neg x_{\neg v,n+1} \vee s)$, for each $v \in V$, where $s$ is a
fresh variable. Clearly, unit resolution on $\Sigma_p|_I$ infers $[s]$ if and only if unit resolution on $\Sigma_c|_I$ produces
the empty clause. For illustrative purposes, Figure A.3 gives an example of how unit resolution on
$\Sigma_p|_I$ simulates unit resolution on $\Sigma_c|_I$.
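
The whole construction is small enough to be tested exhaustively on a toy formula. The following Python sketch is one possible reading of it, given under assumptions: literals are nonzero integers, the variables of $V$ are numbered 1 to $n$, each singleton clause $(\omega)$ of the input is translated into the unit clause $(x_{\omega,1})$, the numbering of the auxiliary variables is arbitrary, and the names `build_sigma_p`, `propagate` and `partial_assignments` as well as the input formula are hypothetical.

```python
from itertools import product


def propagate(clauses, assignment):
    """Stage-wise unit resolution; returns (inferred literals, contradiction)."""
    formula = [tuple(c) for c in clauses] + [(l,) for l in assignment]
    U = set()
    while True:
        inferred = set()
        for c in formula:
            open_lits = [l for l in c if -l not in U]
            if not open_lits:
                return U, True
            if len(open_lits) == 1:
                inferred.add(open_lits[0])
        if any(-l in inferred for l in inferred):
            return U | inferred, True
        if inferred <= U:
            return U, False
        U |= inferred


def build_sigma_p(sigma_c, n):
    """Build the propagation formula from sigma_c over variables 1..n."""
    ids = {}

    def x(lit, i):
        # Fresh variable standing for "lit is inferred within i stages".
        return ids.setdefault((lit, i), n + 1 + len(ids))

    out = []
    for v in range(1, n + 1):                       # injection clauses
        out += [(v, x(-v, 1)), (-v, x(v, 1))]
    for w in [l for v in range(1, n + 1) for l in (v, -v)]:
        for i in range(1, n + 1):                   # replication clauses
            out.append((-x(w, i), x(w, i + 1)))
    for c in sigma_c:
        if len(c) == 1:                             # unit clauses
            out.append((x(c[0], 1),))
            continue
        for w in c:                                 # deduction clauses
            for i in range(1, n + 1):
                body = tuple(-x(-r, i) for r in c if r != w)
                out.append(body + (x(w, i + 1),))
    s = n + 1 + len(ids)                            # output variable
    for v in range(1, n + 1):
        out.append((-x(v, n + 1), -x(-v, n + 1), s))
    return out, s


def partial_assignments(variables):
    """Yield every non-contradictory partial assignment as a list of literals."""
    for choice in product((None, 0, 1), repeat=len(variables)):
        yield [v if b else -v for v, b in zip(variables, choice) if b is not None]


sigma_c = [(1, 2), (-2, 3), (-1, -3), (2, 3)]       # hypothetical input
n = 3
sigma_p, s = build_sigma_p(sigma_c, n)
ok = True
for I in partial_assignments(range(1, n + 1)):
    U, contradiction_p = propagate(sigma_p, I)
    contradiction_c = propagate(sigma_c, I)[1]
    ok = ok and not contradiction_p and ((s in U) == contradiction_c)
print(ok)
```

The loop compares, for each of the 27 partial assignments on three variables, whether `s` is inferred from the constructed formula with whether the input formula yields the empty clause, and also checks that the constructed formula itself never does.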

Let $p$ be the size of $\Sigma_c$ and $k$ be the size of the largest clause of $\Sigma_c$. Without loss of generality,
let us suppose that $k \le n$ (any larger clause would be a tautology). The formula $\Sigma_p$ includes $O(n)$
injection clauses, $O(n^2)$ replication clauses, $O(n^2)$ unit clauses, and $O(np)$ deduction clauses. Because
the largest clauses of $\Sigma_p$, which are the deduction clauses, have size at most $k$, the size of $\Sigma_p$ is
$O(npk) = O(n^2 p)$. □
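
The bound can be spelled out term by term. The count below is a sketch that follows the clause families of the construction; it additionally accounts for the $n$ clauses of size 3 that introduce $s$, keeps the $O(n^2)$ bound stated above for the unit clauses, and assumes $p \ge 1$ and $k \le n$.

```latex
\begin{align*}
|\Sigma_p| &\le \underbrace{2n \cdot 2}_{\text{injection}}
  + \underbrace{2n \cdot n \cdot 2}_{\text{replication}}
  + \underbrace{O(n^2)}_{\text{unit}}
  + \underbrace{n \sum_{c \in \Sigma_c} |c|^2}_{\text{deduction}}
  + \underbrace{3n}_{\text{clauses on } s} \\
 &\le O(n^2) + n\,k \sum_{c \in \Sigma_c} |c|
  \;=\; O(n^2) + npk
  \;=\; O(n^2 p).
\end{align*}
```

The deduction term comes from the fact that a clause $c$ of $\Sigma_c$ yields $n\,|c|$ deduction clauses, each of size $|c|$.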

As a corollary of Theorem 2, let $Q$ be a family of constraints for which there exists a polynomial upi
CNF encoding, and let us show how a polynomial upac encoding can be obtained.

By hypothesis, for any constraint $q \in Q$ with variables $V = \{v_1, \ldots, v_n\}$, there is a CNF formula $\Sigma_q$ of
size polynomially related to $n$ such that for any assignment $I$ on $V$, unit resolution on $\Sigma_q|_I$ produces
the empty clause if and only if $I$ falsifies $q$. A upac encoding $\Omega_q$ for $q$ must satisfy the following
additional property: for any literal $\omega \in \mathrm{lit}(V)$, and any assignment $I \in \mathcal{I}_V$ such that $I$ does not falsify
$q$ and $[\omega] \notin I$, unit resolution on $\Omega_q|_I$ does not produce the empty clause, but infers $[\omega]$ if and only if
$I \cup \{[\neg\omega]\}$ falsifies $q$. Such a behavior can be obtained thanks to the following formula:

$\Omega_q = \Sigma_q \wedge \bigwedge_{\omega \in \mathrm{lit}(V)} (\Sigma_{q,\omega} \wedge (\neg s_\omega \vee \neg\omega))$

where each $\Sigma_{q,\omega}$ is a formula allowing unit resolution to compute by propagation, with output variable
$s_\omega$, the contradiction function of the constraint $\Sigma_q \wedge (\omega)$.

3. Related works 

There are at least three research directions related to the study of the expressive power of unit
resolution.

The first one aims to identify the classes of formulae for which unit resolution is a complete refutation
procedure, in the sense that it produces the empty clause if and only if the input formula is not
satisfiable. For example, this property holds for formulae containing only Horn clauses.

The second direction aims to characterize the complexity of determining whether a given formula can
be refuted by unit resolution or not. This decision problem, denoted UNIT, is known to be P-complete,
meaning that for any decision problem $\pi$ with polynomial time complexity, there exists a log-space
reduction from $\pi$ to UNIT [9]. Circuit value, which consists in determining the output value of a Boolean
circuit given its input values, is P-complete too [6]. From the point of view of complexity theory, UNIT
and circuit value therefore have the same expressive power. In the present paper, a different point of
view is adopted. The CNF formula is not the input data of a program, but the program itself. The input
data is a partial truth assignment encoded in a natural way, i.e., each input variable can be either
assigned to 0, assigned to 1, or not assigned.

The third direction is related to the search for efficient CNF encodings of various problems in order to
solve them with a SAT solver. Because unit resolution is implemented efficiently in SAT solvers, many
works aim to find encoding schemes which allow unit resolution to make as many inferences as possible.
In [5], a CNF encoding for enumerative constraints is proposed, which allows unit propagation to make
the same deductions on the resulting formula as restoring arc consistency on the initial constraints
does. This work was innovative because, with the previously known encodings, unit propagation
had less inference power than restoring arc consistency, which is the basic filtering method used in
constraint solvers [10]. It has been followed by various similar works on other kinds of constraints,
such as Boolean cardinality constraints and pseudo-Boolean constraints, for which polynomial
upi and upac encodings are proposed. In [1], a general way to construct a (possibly non-polynomial)
upac encoding for any constraint is proposed. Today, it has become customary, when a new encoding
is proposed, to address the question of the behavior of unit resolution on the obtained SAT instances.
So far, upi and upac CNF encodings are known for enumerative constraints [5], Boolean cardinality
constraints and pseudo-Boolean constraints [11], but the research field remains open regarding,
for example, arithmetic constraints or global cardinality constraints [12].

4. Concluding remarks and perspectives 

To the best of our knowledge, this is the first time that unit resolution is addressed as a computation
model for functions whose domain is a set of partial assignments on Boolean variables. We believe that
this model is appropriate to characterize the inference power of unit resolution in SAT solvers. By
showing that unit contradiction has the same expressive power as unit propagation, we provide a
theoretical insight into the field of encodings of constraint satisfaction problems into CNF in order to
solve them with SAT solvers. The underlying scientific issue is nothing less than determining the scope
of application of SAT solvers: which problems can be reasonably addressed by SAT solvers, which
cannot, and why?

We are currently working on the characterization of the matching functions that can be efficiently
computed by unit resolution, and hence of the constraints for which there exist polynomial upi and upac
encodings. The next step will be to look for a general method for translating, when applicable,
algorithms or Boolean circuits into CNF formulae allowing unit resolution to compute the same
matching functions.

Appendix A. Graphical illustrations 

Here, we give some graphical illustrations of the reduction described in the proofs of the theorems
presented in Section 2.